Sampling strategies for information extraction over the deep web



Similar resources

Sampling strategies for information extraction over the deep web

Information extraction systems discover structured information in natural language text. Having information in structured form enables much richer querying and data mining than possible over the natural language text. However, information extraction is a computationally expensive task, and hence improving the efficiency of the extraction process over large text collections is of critical intere...


Sampling the National Deep Web

A huge portion of today’s Web consists of web pages filled with information from myriads of online databases. This part of the Web, known as the deep Web, is to date relatively unexplored and even major characteristics such as number of searchable databases on the Web or databases’ subject distribution are somewhat disputable. In this paper, we revisit a problem of deep Web characterization: ho...


Sampling Strategies for Information Goods

This paper analyzes optimal decisions concerning the size of the sample and the price of the paid content for online publishers of digital information goods when sampling serves the dual purpose of disclosing content quality and generating advertising revenue. We show in a reduced-form model how the publisher’s optimal ratio of advertising revenue to sales revenue is linked to characteristics o...


Sampling, information extraction and summarisation of Hidden Web databases

Hidden Web databases maintain a collection of specialised documents, which are dynamically generated in response to users’ queries. The majority of these documents are generated through Web page templates, which contain information that is often irrelevant to queries. In this paper, we present a system designed to detect and extract query-related information from documents sampled from database...


Journal

Journal title: Information Processing & Management

Year: 2017

ISSN: 0306-4573

DOI: 10.1016/j.ipm.2016.11.006